Abstract: Analysis of gene expression data is a significant area of DNA microarray research. Data mining methods have proven useful for understanding gene function, gene regulation, cellular processes, and cell subtypes. Microarrays have made it possible to gather large gene expression datasets cheaply, and biclustering has become established as a popular method for mining patterns in these datasets. Biclustering algorithms cluster the rows and columns of the data matrix simultaneously; this approach is well suited to gene expression data because genes are often related only across a subset of samples, and vice versa. In the past decade, many biclustering algorithms that specifically target gene expression data have been published, but only a few are commonly used in bioinformatics pipelines. To analyze gene expression data, biclustering groups objects according to their similarity, so that elements in the same group are as similar to one another as possible. The Order-Preserving Submatrix (OPSM) model is now widely used to analyze DNA microarray data. In this paper, we propose a method to find all possible OPSMs in the data. We then find all common subsequences (ACS), using the FP-Growth algorithm to mine frequent sequential patterns.

Keywords: OPSM, microarray data, gene expression analysis, Apriori algorithm, FP-Growth algorithm.
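
To make the order-preserving idea in the abstract concrete, the following minimal sketch shows how each gene (row) can be turned into a sequence of column indices sorted by expression value, and how to test which rows support a given column ordering. It is an illustration only, not the proposed FP-Growth-based method; the function names (`column_order`, `supports`, `supporting_rows`) and the toy matrix are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def column_order(row: Sequence[float]) -> Tuple[int, ...]:
    """Return the column indices of a row sorted by increasing expression value."""
    return tuple(sorted(range(len(row)), key=lambda j: row[j]))


def supports(row: Sequence[float], pattern: Sequence[int]) -> bool:
    """True if the row's values strictly increase along the columns in `pattern`."""
    values = [row[j] for j in pattern]
    return all(a < b for a, b in zip(values, values[1:]))


def supporting_rows(matrix: Sequence[Sequence[float]],
                    pattern: Sequence[int]) -> List[int]:
    """Indices of the rows (genes) that follow the column ordering `pattern`."""
    return [i for i, row in enumerate(matrix) if supports(row, pattern)]


# Toy expression matrix (assumed data): 4 genes x 4 samples.
expression = [
    [0.2, 0.9, 0.5, 0.7],
    [1.0, 4.0, 2.0, 3.0],
    [0.8, 0.1, 0.6, 0.3],
    [5.0, 9.0, 6.0, 2.0],
]

# Rows that increase along columns 0 -> 2 -> 1 form an OPSM with these columns.
pattern = (0, 2, 1)
rows = supporting_rows(expression, pattern)
print(rows)
print([column_order(r) for r in expression])
```

Under this view, an OPSM is a set of rows together with a column ordering that all of those rows support, so rows whose column-order sequences share a common subsequence are candidates for the same OPSM.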